Generative Models for Graph-Based Protein Design

Neural Information Processing Systems

Engineered proteins offer the potential to solve many problems in biomedicine, energy, and materials science, but creating designs that succeed is difficult in practice. A significant aspect of this challenge is the complex coupling between protein sequence and 3D structure, with the task of finding a viable design often referred to as the inverse protein folding problem. We develop relational language models for protein sequences that directly condition on a graph specification of the target structure. Our approach efficiently captures the complex dependencies in proteins by focusing on those that are long-range in sequence but local in 3D space. Our framework significantly improves in both speed and robustness over conventional and deep-learning-based methods for structure-based protein sequence design, and takes a step toward rapid and targeted biomolecular design with the aid of deep generative models.
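Reading "language model" in its usual autoregressive sense (an assumption about the abstract's wording; the symbols s, x, and N are introduced here only for illustration), conditioning on a structure graph can be sketched as a factorization of the sequence likelihood one residue at a time:

```latex
p(s \mid x) \;=\; \prod_{i=1}^{N} p\left(s_i \mid x,\; s_1, \dots, s_{i-1}\right)
```

Here s = (s_1, ..., s_N) stands for the amino-acid sequence and x for the graph specification of the target structure. Each factor is the model's distribution over the next residue given the structure and the residues already chosen.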


Reviews: Generative Models for Graph-Based Protein Design


This paper addresses the problem of generating protein sequences for a desired 3D structure, also known as the "inverse protein folding problem". The authors introduce a model inspired by recent advances in language modeling (for the sequence decoder) and graph representation learning (for the encoder). Protein structures are represented as k-NN graphs, enriched with orientation/location-based features and features based on structural bindings. The encoder is an adapted Graph Attention Network, here termed "Structured Transformer", which is enriched with edge features and relative positional encodings, and the decoder is an auto-regressive Transformer-based model. Results indicate improvements over a recent deep neural network baseline for this task.
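The k-NN graph representation mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `knn_graph` and `edge_features`, the toy coordinates, and the choice of edge features (a 3D distance and a signed sequence offset) are assumptions made for illustration, and the orientation-based features the review describes are omitted.

```python
import math


def knn_graph(coords, k):
    """Return, for each residue i, the indices of its k nearest residues in 3D."""
    neighbors = []
    for i, ci in enumerate(coords):
        # Distances from residue i to every other residue (self excluded).
        dists = [(math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def edge_features(coords, neighbors):
    """Attach a 3D distance and a signed sequence offset (j - i) to each edge."""
    return {
        (i, j): {"distance": math.dist(coords[i], coords[j]), "seq_offset": j - i}
        for i, nbrs in enumerate(neighbors)
        for j in nbrs
    }


# Hypothetical toy backbone: six points along a hairpin-like path, so that
# residues 0 and 5 are far apart in sequence but close in space.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
]

neighbors = knn_graph(toy_coords, k=2)
features = edge_features(toy_coords, neighbors)
```

In this toy example the neighbor list of residue 0 contains residue 5, an edge that is long-range in sequence but local in 3D space. Edges of this kind are what a spatial k-NN graph exposes to the encoder.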


Reviews: Generative Models for Graph-Based Protein Design


The reviewers agree that this paper addresses a high-impact problem, advances on the state of the art, and is well written.

